skip to main content


Search for: All records

Creators/Authors contains: "Liu, Fangyu"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Abstract Objective . Neural decoding is an important tool in neural engineering and neural data analysis. Of various machine learning algorithms adopted for neural decoding, the recently introduced deep learning is promising to excel. Therefore, we sought to apply deep learning to decode movement trajectories from the activity of motor cortical neurons. Approach . In this paper, we assessed the performance of deep learning methods in three different decoding schemes, concurrent, time-delay, and spatiotemporal. In the concurrent decoding scheme where the input to the network is the neural activity coincidental to the movement, deep learning networks including artificial neural network (ANN) and long-short term memory (LSTM) were applied to decode movement and compared with traditional machine learning algorithms. Both ANN and LSTM were further evaluated in the time-delay decoding scheme in which temporal delays are allowed between neural signals and movements. Lastly, in the spatiotemporal decoding scheme, we trained convolutional neural network (CNN) to extract movement information from images representing the spatial arrangement of neurons, their activity, and connectomes (i.e. the relative strengths of connectivity between neurons) and combined CNN and ANN to develop a hybrid spatiotemporal network. To reveal the input features of the CNN in the hybrid network that deep learning discovered for movement decoding, we performed a sensitivity analysis and identified specific regions in the spatial domain. Main results . Deep learning networks (ANN and LSTM) outperformed traditional machine learning algorithms in the concurrent decoding scheme. The results of ANN and LSTM in the time-delay decoding scheme showed that including neural data from time points preceding movement enabled decoders to perform more robustly when the temporal relationship between the neural activity and movement dynamically changes over time. In the spatiotemporal decoding scheme, the hybrid spatiotemporal network containing the concurrent ANN decoder outperformed single-network concurrent decoders. Significance . Taken together, our study demonstrates that deep learning could become a robust and effective method for the neural decoding of behavior. 
    more » « less
  2. Deep neural networks are often overparameterized and may not easily achieve model generalization. Adversarial training has shown effectiveness in improving generalization by regularizing the change of loss on top of adversarially chosen perturbations. The recently proposed sharpness-aware minimization (SAM) algorithm conducts adversarial weight perturbation, encouraging the model to converge to a flat minima. SAM finds a common adversarial weight perturbation per-batch. Although per-instance adversarial weight perturbations are stronger adversaries and can potentially lead to better generalization performance, their computational cost is very high and thus it is impossible to use per-instance perturbations efficiently in SAM. In this paper, we tackle this efficiency bottleneck and propose sharpness-aware minimization with dynamic reweighting (delta-SAM). Our theoretical analysis motivates that it is possible to approach the stronger, per-instance adversarial weight perturbations using reweighted per-batch weight perturbations. delta-SAM dynamically reweights perturbation within each batch according to the theoretically principled weighting factors, serving as a good approximation to per-instance perturbation. Experiments on various natural language understanding tasks demonstrate the effectiveness of delta-SAM. 
    more » « less
  3. null (Ed.)
    Abstract Many previous studies have shown that an Indian Ocean basin warming (IOBW) occurs usually during El Niño–Southern Oscillation (ENSO) decaying spring to summer seasons through modifying the equatorial zonal circulation. Decadal modulation associated with the interdecadal Pacific oscillation (IPO) is further investigated here to understand the nonstationary ENSO–IOBW relationship during ENSO decaying summer (July–September). During the positive IPO phase, significant warm sea surface temperature (SST) anomalies are observed over the tropical Indian Ocean in El Niño decaying summers and vice versa for La Niña events, while these patterns are not well detected in the negative IPO phase. Different decaying speeds of ENSO associated with the IPO phase, largely controlled by both zonal advective and thermocline feedbacks, are suggested to be mainly responsible for these different ENSO–IOBW relationships. In contrast to ENSO events in the negative IPO phase, the ones in the positive IPO phase display a slower decaying speed and delay their transitions both from a warm to a cold state and a cold to a warm state. The slower decay of El Niño and La Niña thereby helps to sustain the teleconnection forcing over the equatorial Indian Ocean and corresponding SST anomalies there can persist into summer. This IPO modulation of the ENSO–IOBW relationship carries important implications for the seasonal prediction of the Indian Ocean SST anomalies and associated summer climate anomalies. 
    more » « less
  4. Pretrained Transformers achieve remarkable performance when training and test data are from the same distribution. However, in real-world scenarios, the model often faces out-of-distribution (OOD) instances that can cause severe semantic shift problems at inference time. Therefore, in practice, a reliable model should identify such instances, and then either reject them during inference or pass them over to models that handle another distribution. In this paper, we develop an unsupervised OOD detection method, in which only the in-distribution (ID) data are used in training. We propose to fine-tune the Transformers with a contrastive loss, which improves the compactness of representations, such that OOD instances can be better differentiated from ID ones. These OOD instances can then be accurately detected using the Mahalanobis distance in the model’s penultimate layer. We experiment with comprehensive settings and achieve near-perfect OOD detection performance, outperforming baselines drastically. We further investigate the rationales behind the improvement, finding that more compact representations through margin-based contrastive learning bring the improvement. We release our code to the community for future research. 
    more » « less